Securing Personally Identifiable Information (PII) in DPS
Organizations are often required to use production data for development, testing, analytics, or training machine learning models. This data can contain Personally Identifiable Information (PII), Protected Health Information (PHI), or Payment Card Industry (PCI) data. Sensitive data is especially vulnerable during collection, transformation, transmission, or temporary storage. As a security best practice, DPS lets you mask and scramble sensitive data to ensure that it is not exposed during data processing.
The Secure PII feature in Calibo Accelerate safeguards sensitive information and helps your organization maintain compliance. It prevents unintended data exposure when production data is used in lower environments, while still providing realistic, high-quality datasets for testing and analytics without legal or ethical risk. By embedding security principles into the core of the platform, it strengthens data governance and builds trust across the organization.
Industry use cases for securing PII
Here are some industry use cases where securing sensitive data is essential.
- Banking/Finance: Mask account numbers or scramble transaction references before sharing data with offshore testing teams.
- Healthcare: Scramble patient identifiers during analytics runs while preserving the distribution of values for statistical accuracy.
- Retail/E-commerce: Mask credit card details or loyalty IDs while enabling recommendation engines to run on anonymized data.
- Insurance: Scramble claims data fields so actuaries can model risks without exposing sensitive customer data.
Scrambling
Scrambling jumbles the characters in the selected column of a dataset without preserving the original structure or meaning. It helps randomize sensitive data while preserving the original length.
| Scrambling Strategy | Description |
|---|---|
| Full String Scramble | Randomizes all characters in the entire string, including spaces and punctuation. |
| Scramble Within Word | Scrambles characters inside each word, preserving word boundaries and the original word order. |
| Scramble Word Order in Sentence | Shuffles the order of words in a sentence, keeping the characters within each word unchanged. |
| Scramble Middle Characters of Each Word | Scrambles only the middle characters of each word, keeping the first and last characters intact. |
| Scramble N Random Pairs | Swaps N random pairs of characters within the string. |
| Scramble Every Nth Character | Scrambles every Nth character in the string while keeping the others in place. |
| Reverse String Scramble | Reverses the entire string from end to start. |
| Scramble Based on Word Length | Scrambles only the words that meet the specified length condition. Example: words longer than 5 characters are scrambled, while shorter words remain unchanged. |
| Scramble N Percentage | Scrambles a defined percentage of characters in the string. Example: with 40% selected, 4 characters of the sample value are scrambled. |
| Block Scramble | Divides the string into blocks of fixed size and scrambles characters within each block. Example: with a block size of 4, characters are scrambled within each 4-character block. |
| Scramble All Except First N Characters | Scrambles all characters except the first N characters. Example: the first 3 characters are kept intact and the rest are scrambled. |
| Scramble All Except Last N Characters | Scrambles all characters except the last N characters. Example: the last 4 characters are kept intact and the rest are scrambled. |
| Pattern-Based Scramble | Scrambles only the characters that match a specified pattern. Maintains format-like patterns (for example, phone numbers or license plates) while obscuring sensitive segments. |
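To make a few of these strategies concrete, here is a minimal Python sketch. It is an illustration only, not the DPS implementation: the function names, the use of a seeded `random.Random`, and the rule that a word is split on single spaces are all assumptions made for the example.

```python
import random


def full_string_scramble(value: str, rng: random.Random) -> str:
    """Shuffle every character, including spaces and punctuation."""
    chars = list(value)
    rng.shuffle(chars)
    return "".join(chars)


def scramble_middle_of_words(value: str, rng: random.Random) -> str:
    """Shuffle the interior of each word; first and last characters stay put."""

    def scramble_word(word: str) -> str:
        if len(word) <= 3:
            return word  # zero or one interior character: nothing to shuffle
        middle = list(word[1:-1])
        rng.shuffle(middle)
        return word[0] + "".join(middle) + word[-1]

    # Splitting and re-joining on a single space keeps word boundaries intact.
    return " ".join(scramble_word(w) for w in value.split(" "))


def block_scramble(value: str, block_size: int, rng: random.Random) -> str:
    """Cut the string into fixed-size blocks and shuffle within each block."""
    blocks = [value[i:i + block_size] for i in range(0, len(value), block_size)]
    return "".join(full_string_scramble(block, rng) for block in blocks)


def scramble_except_first_n(value: str, n: int, rng: random.Random) -> str:
    """Keep the first n characters and shuffle the remainder."""
    return value[:n] + full_string_scramble(value[n:], rng)


def reverse_string(value: str) -> str:
    """Reverse the string from end to start."""
    return value[::-1]


# Hypothetical sample value; a fixed seed makes the run repeatable.
rng = random.Random(42)
sample = "Jonathan Smith"
print(full_string_scramble(sample, rng))
print(scramble_middle_of_words(sample, rng))
print(block_scramble(sample, 4, rng))
print(scramble_except_first_n(sample, 3, rng))
print(reverse_string(sample))
```

In every case the output has the same length and the same set of characters as the input; only their positions change. Note that a shuffle can occasionally return a short value unchanged, so a production implementation may want to re-shuffle when that happens.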
Masking
Masking replaces characters in the selected column with system-defined or custom characters, based on the chosen rule. It retains the original format and structure of the data while ensuring sensitive details remain hidden.
| Masking Strategy | Description |
|---|---|
| Mask Entire Value | Replaces the entire value with a masking character. Useful for hiding sensitive fields such as tokens or passwords. |
| Mask Digits Only | Masks only the numeric characters. |
| Mask Every Nth Character | Masks every Nth character in the string. Example: every 3rd character is masked. |
| Mask First N Characters | Masks the first N characters in each value. Example: the first 4 characters are masked. |
| Mask Last N Characters | Masks the last N characters in each value. Example: the last 8 characters are masked. |
| Mask All Except First N Characters | Keeps the first N characters visible and masks the rest. Example: only the first 3 characters are shown and the remaining characters are masked. |
| Mask All Except Last N Characters | Keeps the last N characters visible and masks the rest. Example: only the last 4 characters are shown and the remaining characters are masked. |
| Mask Middle N Characters | Masks a group of N characters in the middle of each value. Example: the 4 middle characters are masked and the remaining characters are shown. |
| Mask First N% Characters | Masks the first N% of the characters in each value. |
| Mask Last N% Characters | Masks the last N% of the characters in each value. |
| Pattern-Based Masking | Masks characters based on user-defined regex-like patterns. |
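The masking rules can be sketched the same way. The Python below is a hedged illustration rather than the product's implementation: the function names, the default `*` mask character, the choice to centre the "middle N" window, and the use of Python regular expressions for the pattern rule are assumptions made for the example.

```python
import re


def mask_entire_value(value: str, mask_char: str = "*") -> str:
    """Replace every character with the mask character."""
    return mask_char * len(value)


def mask_digits_only(value: str, mask_char: str = "*") -> str:
    """Mask numeric characters and leave everything else visible."""
    return re.sub(r"\d", mask_char, value)


def mask_every_nth(value: str, n: int, mask_char: str = "*") -> str:
    """Mask the nth, 2nth, 3nth, ... character (1-based positions, n >= 1)."""
    return "".join(
        mask_char if (i + 1) % n == 0 else ch for i, ch in enumerate(value)
    )


def mask_first_n(value: str, n: int, mask_char: str = "*") -> str:
    """Mask the first n characters (n >= 0)."""
    n = min(n, len(value))
    return mask_char * n + value[n:]


def mask_all_except_last_n(value: str, n: int, mask_char: str = "*") -> str:
    """Keep the last n characters visible and mask the rest (n >= 0)."""
    hidden = len(value) - min(n, len(value))
    return mask_char * hidden + value[hidden:]


def mask_middle_n(value: str, n: int, mask_char: str = "*") -> str:
    """Mask n characters centred in the value (assumed placement, n >= 0)."""
    n = min(n, len(value))
    start = (len(value) - n) // 2
    return value[:start] + mask_char * n + value[start + n:]


def mask_pattern(value: str, pattern: str, mask_char: str = "*") -> str:
    """Mask every substring matching a regular expression, keeping its length."""
    return re.sub(pattern, lambda m: mask_char * len(m.group()), value)


# Hypothetical sample values.
print(mask_digits_only("Card 4111-1111"))               # Card ****-****
print(mask_every_nth("ABCDEFGHI", 3))                   # AB*DE*GH*
print(mask_first_n("4111111111111111", 4))              # ****111111111111
print(mask_all_except_last_n("4111111111111111", 4))    # ************1111
print(mask_middle_n("ABCDEFGHIJ", 4))                   # ABC****HIJ
print(mask_pattern("SSN 123-45-6789", r"\d{3}-\d{2}"))  # SSN ******-6789
```

Because each masked character is replaced one for one, the length and layout of the value are preserved, which matches the format-retaining behavior described above. The percentage-based rules can be built on `mask_first_n` by converting the percentage to a character count first; how DPS rounds that count is not stated here.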
By applying scrambling or masking, you can protect sensitive information while still providing realistic data for development, testing, and analytics. This helps your organization maintain compliance and build customer trust.
What's next? Data Integration using Unity Catalog